Fast method for force computations in electronic structure calculations 
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We present new efficient {0{N log A^)) methods for computing three quantities crucial to electronic 
structure calculations: the ionic potential, the electron-ion contribution to the Born-Oppenheimer 
forces, and the electron-ion contribution to the stress tensor. The present methods are applicable 
to calculations in which the electronic charge density is represented on a uniform grid in real space. 
They are particularly well-suited for metallic extended systems, where other 0{N) methodologies 
are not readily applicable. Based on a fast algorithm for determining the atomic structure factor, 
originally developed by Essmann et al. Q for fast Ewald energy and force computation, the present 
methods involve approximations that can be systematically improved. The methods are tested on a 
representative metallic system (bulk Al), and their ability to simultaneously achieve high accuracy 
and efficiency is demonstrated. 

PACS numbers: 71.15.-m, 71.15.Dx 



A wealth of efficient methods have recently been devel- 
oped [llllli,|lllllll|lS|ll|l2for calculating elec- 
tronic properties of an extended physical system that re- 
quire an amount of computation that scales linearly with 
the size of the system N. This size can be deffiied to be 
the number of atoms or the number of valence electrons, 
or the volume of the system, all of which are linearly 
related for large condensed systems. In this article we 
present a quasi-linear-scaling {0{N log N)) method for 
computing the ionic potential, the ionic forces, and the 
stress tensor in electronic structure calculations. Atomic 
forces are necessary for the calculation of many physical 
properties of a system, including the determination of 
the optimal structure and simulation at a finite tempera- 
ture. Some linear-scaling methods achieve linear compu- 
tational scaling for the computation of the energy but not 
for the forces on all of the ions|8|. Other linear-scaling 
methods achieve efficient force calculations by working 
with a basis of localized functions!^ 0, which are 
less efficient at representing delocalized electronic states 
found for example in metallic systems. The present 
method applies to calculations performed in a periodic 
parallelepiped supercell, not necessarily orthogonal, in 
which the electronic charge density p(r) is represented 
on a uniform grid. 

As a result of the Hellmann-FeynmanfTs', TS] theorem, 
the force on the pth ion (within the Born-Oppenheimer 
approximation) is given by the sum of the partial deriva- 
tives of the ion-ion energy (the Ewald energy) and the 
electron-ion energy with respect to the atomic coordi- 
nates: 



where 



dE 



Ewald 



(1) 



p(r)F*°"(r)dr 



and the ionic potential is defined as: 

Nat 

F*°"(r) = ^^'^(^ - tp - R) 

R p=l 
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where here the tp are the atomic positions within the 
unit cell; V''"'P{r) is the pseudopotential representing the 
ions; and the outer sum over R is over all lattice trans- 
lation vectors R = niSLi + n2B.2 + n^SL^ for all integers 
n.i, where the arc the lattice vectors defining the unit 
cell. For simplicity of presentation, we will consider sys- 
tems that involve only one type of pseudopotential, but 
the methods presented here generalize readily to systems 
with multiple pseudopotentials. Furthermore, non-local 
pseudopotentials can be split into a short-ranged non- 
local part and a long-ranged local part; only the long- 
ranged local part of V^p*p will be considered here. It 
should be noted that the present methods make only 
part of the electronic structure calculation efficient, and 
in the context of the Kohn-Sham method, the overall 
electronic structure calculation will still scale as iV'^ due 
to the need to orthogonalize N wavefunctions or to di- 
agonalize a.n N x N matrix. Thus the present methods 
are particularly relevant to the orbital-free density func- 
tional methods [TllIT^ . which deal only with local pseu- 
dopotentials and can achieve 0{N) scaling for the entire 
calculation. 

The Smooth Particle Mesh Ewald MethodQ (SPME) 
is an efficient scheme (O(A^logA^)) for computing the 
Ewald energy E^'^"-^'^ and its derivatives with respect 
to the atomic coordinates, dE^^"-'"'^ /dtp. Here we show 
how similar ideas can be used to yield efficient meth- 
ods (also 0{N log N)) for determining the ionic poten- 
tial y*°"(r), and the other component of the Born- 
Oppenheimer forces, dE'^~^ /dtp. 
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We begin presentation of our method by expressing 
V°"(r) in terms of the structure factor, S{r): 



T/»°"(r) = ^ y" ^"'^(i- - r')S{r')dr' 



where 



Nat 



%) = f7^^<5(r-tp-R) 



(4) 



(5) 



R p=l 



and n = \a.i ■ {3.2 x 33)! is the unit ceU volume. 

In reciprocal space, the expression for becomes: 



•trion 



1 f v'°"{r)e'^''dr 

^ J cell 



where 



and, using Eq. 

Sq ^ U 5(r)e^Q"-dr 

" J cell 
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Inversely, we have: 



Q 



(8) 
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where the Q-sum ranges over all integer multiples of the 
reciprocal lattice vectors hi defined by • hj — 2'K6ij . 

For the purposes of the present work, we assume that 
p{r) is represented on a iVi x A^2 x ^3 grid of points 
^hhh = j^^i + m_^2 + ]^a3, with = 0, ...,iV, - 1. 
We will denote quantities such as pi^i-^i^is) as h, h)- 
The Fourier transform of p(r) is approximated by: 

p(mibi + m2b2 + mabs) ~ T[p{li,l2, h)] (10) 

where J-' is the discrete Fourier transform: 

N1-1N2-1N3-1 

nfihj2,h)] = E E E /('i'^2,;3) 



h=0 l2=0 13=0 



X e ^ JVi + 7V2 + Ns I (11) 

where N = NiN2N3. Its inverse, is given by: 

Ni-l N2-I N3-I 

:F^^[f{mi,m2,m3)] = E E E /(™i,™2,m3) 

mi— 7712—0 7773—0 

0„„V il777i I i27T72 i Isms \ 

X e ^ N, + N2 + N3 \i2) 



Using Eq. lO, and transforming back to the real space 
grid, we can determine l/'°"(r) via: 



1 



n 



P (mi, 1^2,7713) 3(7711,1112,1^3) 



(13) 



where P(mi, TO2, ^3) is the array given by: 

P{7ni, 7112,7713) = VP'P {7n\hi + 77i'2h2 + 7n'^h3) (14) 

and 77i[ — 77ii for < < Ni/2 and m,^ = rrii — Ni 
otherwise. The array S'(mi, 7712,7713) is given by 

S'(7rJl, 7)7,2, 77I3) = 5(7)7,lbl + 7772b2 + 7)7,3b3) (15) 

Calculation of the structure factor via Eq. (jSl will 
scale with the square of the system size, because it needs 
to be computed at every reciprocal space grid point 
(1721,7712,7723), and at each of these points a sum must 
be performed over each atom in the system. However, 
Essmann et al., as part of their efficient Smooth Parti- 
cle Mesh Ewald method 1], provide an elegant method 
for computing the structure factor efficiently (albeit ap- 
proximately), requiring an amount of computation that 
scales as only NlogN. By incorporating this algorithm 
to compute the structure factor, the present methods 
for computing both the ionic potential y°"(r) and the 
electron-ion contribution to the forces F^"* achieve the 
same O(iVlogiV) quasi-linear scaling. 

Here we summarize the method of Essmann et al. for 
efficiently computing the structure factor, but refer the 
reader to the original reference Q for full details. The 
crux of the algorithm lies in the approximation of the ex- 
ponential in Eq. |(HJ) with (complex) cardinal B-splines. 
The Ttth order cardinal B-spline function, M„(x), is de- 
fined as follows: M2(x) = 1 — |a; — 1| for < x < 2, 
and M2{x) = otherwise; and the higher-order cardinal 
B-splines are defined recursively: 

X 71 — X 

Mn{x) = -M„_i(x) + -Mn-lix - 1) (16) 

77 — 1 77 —1 

We define the grid coordinates of the pth atom as Uip = 
Nitp ■ h„ or equivalcntly tp = + ^a2 + ^^3. 

The structure factor, expressed in terms of the grid co- 
ordinates, is: 



Nat 



.mi 



8(7711,7712,7713) = ^exp ( 27ri^wip 
p— 1 ^ ^ 



I "72 \ / 7(7,3 \ 

X exp I 27r7— 7i2p I exp I 27r7— U3p 1 (17) 

The exponentials can be approximated by Tith order 
cardinal B-splines, where 77 is even, as: 



exp ( 2m^Ujp ) ~ bj(77ij) ^ Mn(ujp - k) 



fe=-c 



X exp ( 2mj^kj (18) 
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where bj(mj) is: 
bj{mj) = 



exp(27ri(n — l)mj/Nj) 



Y: M„(fc + l)exp(2^i^fc) 

k=0 ^ 



(19) 



When this B-spline approximation of the exponential is 
used in Eq. I|17|) , it becomes a discrete Fourier transform: 

S'(mi, 7712,7713) ~ 5(7711,7772,7)13) 

Nat 00 

Mn{uip-ki)Mn{u2p-k2) 



p—1 fci ,/c2,A;3 — — 00 

27r7 



V Ni 
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where 



and: 



xM„(u3p - A:3)e 

= 7Vi3(777l,7r72,7)l3)J^[(3(;i,Z2,^3)] 
i? (771 1, 7772, m3) = &l(771l)62(7)l2)63("l3) 



(20) 
(21) 
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(3(^1,^2,^3) = XI H Mniuip - h - CiNi) 

p—1 Ci,C2,C3 — — CXD 

XM„(7t2p - ^2 - C2iV2)M„(u3p - ^3 - C37V3) (22) 

The array Q{li,l2, h) can be computed quickly (0(iV"*)), 
because it is only non-zero for sub-cubes of dimension 
n X n X n located near each atom. It is because TiQ] 
can be computed with the fast Fourier transform (FFT) 
(which is performed in 0{N log N) operations) that the 
structure factor itself can be computed with 0(iVlog A^) 
operations. 

It is now easily seen how the structure factor algorithm 
provided by the SPME method can be used to yield an 
efficient method for computing V°"(^i, ^2, ^3)- By sub- 
stituting Eq. ((Snj into Eq. ifT^ . we obtain: 



l^'°"(/l,/2,;3) = ^^-' 



P(77ll, 7772, 7)13)5(7711, 7772, "I3) 

x:F[Qihj2,h)]\ (23) 



In practice, Eq. H23|l could be used to efficiently com- 
pute V^*°"(/i, ^2, ^3) with the following algorithm. First 
Q{h-,h,h) is computed via Eq. if^ . Then F[Q] is ob- 
tained via the FFT. J-[Q] is then multiplied by B and P, 
defined by Eqs. ^ and (O. Then the inverse FFT of 
this product is computed and multiplied by N/D,^ yield- 
ing the array V^*°"(/i, ^2, /s)- 

The calculation of the electron-ion contribution to the 
ionic force, dE'^~'^ / dtp can also be made efficient, and 
again this comes from expressing dE^~'^ / dtp in terms of 
the structure factor. In the discrete variable representa- 
tion, the expression for the pseudopotential energy, Eq. 
(0), becomes: 



5" 



^ X pih,l2,h)V"'"ih,l2j3) (24) 



which can be viewed as a dot product of p and 1/*°"; we 
can also evaluate this dot product in reciprocal space, and 
taking into account that p is an array of real numbers, 
the expression becomes: 

mi ,7712 ,7713 

then, using Eq. H23|l for this becomes: 

E'-' ^ N J2 ^[Pih,l2,h)]* P{mi,m2,m3) 

mi ,m2 ,m3 

X5(777l,77l2,77l3)^[g(?l,;2,Z3)] (26) 

Following this substitution, we now express the dot prod- 
uct in real space again: 



E'^-' ~ Y Q{h,h,i3)^-'[:F[p{h,i2,i3)] 

h ,^2:^3 

xP(mi,m2, m3)*_B(mi,m2, ms) 



(27) 



Because the only factor that depends on the atomic po- 
sitions is Q, we can readily differentiate with respect to 
the atomic positions: 



dE" 



dt. 



E 



dQ 



at. 



XP(?711, 7712, 7r73)*i?(777l, 7)12,7773)* (28) 



where a — 1,2,3 is the vector component of the force. 
Eqs. (|23l) and H28|) (and the expression for the stress 
tensor, Eq. l|A3|l ') constitute the central results of this 
article. 

The partial derivatives dQ/dtpa can be evaluated 
readily with the definition of Q, Eq. (|22|l . and with the 
aid of the following B-spline identity: 



d 
dx 



M„{x) = M„_i(x) - M„_i(a; - 1) 



(29) 



Computing the partial derivatives of Eq. H28I) for all 
of the atoms in the system requires an amount of com- 
putation that scales as NlogN. As with Q, the par- 
tial derivative array dQ/dtpa(hj2,h) is only non-zero 
in sub-cubes near the atomic positions, so computing it 
requires 0{N) computation. The fast Fourier transforms 
scale as A^logA^. 

The application of Eq. (|28|l for rapid computation of 
the electron-ion forces F*^^* can proceed algorithmically 
as follows. The Fourier transform of p{h,l2,h) is com- 
puted; then !F[p] is multiplied by P* and B* . Then the 
inverse Fourier transform of this product is found. Then 
by utilizing Eq. 129|l . the derivatives dQ/dtpaihjhjh) 
are computed during the summing over the Z^'s, yielding 
dE''-'/dtpa. 

Similar ideas can yield an efficient expression for the 
computation of the stress tensor. The details of the effi- 
cient stress tensor method are covered in the appendix. 
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Although these methods have been presented assum- 
ing that only one type of pseudopotential (and hence only 
one type of ion) is present in the system, multiple types 
are readily treated. Since the ionic potential is a lin- 
ear function of the pseudopotentials, then with multiple 
types of pseudopotentials the total ionic potential is a 
sum of ionic potentials of the different types, 

vz"{hj2,k)^Y.^r{iij2j3) (30) 

r 

where r indexes the pseudopotential type. The individ- 
ual V7°" are each computed with Eq. H23|) using the 
Q-array associated with this set of ions, and the P-array 
associated with this pseudopotential type. Likewise to 
compute the ionic forces, Eq. (|28|l is used to compute 
the forces on all ions of a given pseudopotential type, us- 
ing the Q-array associated with this set of ions, and the 
P-array associated with this pseudopotential type. 

Several tests have been performed to examine the accu- 
racy of these methods. The accuracy of the approximate 
structure factor, Eq. H20II . increases when the number 
of grid points N increases, and when the B-spline order 
n is increased. However, in an electronic structure cal- 
culation, one is not simply free to choose the number of 
grid points for computing S. It is clear from Eqs. H23|) 
and (|28|) that S and the electronic charge density p must 
exist on the same grid, and typically energy convergence 
considerations dictate a minimum grid density on which 
p is represented. So it must be established that for a 
given grid density, the error incurred by using the ap- 
proximate S when generating and calculating the 
forces (i.e. the present methods, Eqs. and if^ l is 
not much more than the error that would be present us- 
ing the same finite number of grid points and the exact 
structure factor S when generating V^°^ and calculating 
the forces. In other words, it must be shown that the er- 
ror in the total energy and forces that comes from using 
an approximate S is smaller than the error due to using a 
grid of finite density. Furthermore, in order to establish 
that these methods are indeed (quasi)linear scaling, it 
must be demonstrated that for a fixed grid density N/ ft 
and fixed B-splinc order n, the error per atom due to the 
approximate S does not increase when the system size is 
increased. 

All tests here have been done on systems of alu- 
minum atoms simulated with orbital-free density func- 
tional theorv[TlL fl^ . The present methods for gener- 
ating y'°"(ri) (i.e. the method of Eq. ^) and F'^-' 
(Eq. H28II ') were tested as follows: first, in order to test 
the error due to the approximate S compared to the er- 
ror from the rest of the calculation, 32 Al atoms were 
placed in a cubic box S.OSA on a side, and displaced ran- 
domly by about 0.5A from their fee crystalline positions. 
Then was generated using the exact S (Eq. ||SJ)), 
and the present method (Eq. H23|l ) with B-spline orders 
n = 6,8, 10, 12, and separate electronic relaxations were 
done in each of these y*°"'s, yielding corresponding total 
energies. After electronic relaxation, electron-ion forces 
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FIG. 1: (a) Relative error in the energy compared to infi- 
nite grid density limit, Eq. 13H . (b) Relative error in F*^"' 
compared to the infinite grid limit, Eq. II32II . 

F*^"' were calculated using derivatives of the exact S, and 
with the present method (Eq. H28|l 'l. This was done for 
successively higher grid densities. The results are shown 
m Fig. H] The relative error between the energy and the 
converged energy, as measured by: 



(where E°° is the energy in the limit of infinite grid den- 
sity) is plotted as a function of grid density for different 
choices of ]/*°": y-on generated with the exact method 
and the present method. E°° is taken to be the total 
energy evaluated with a grid density of 800 points/ A'^, a 
considerably higher density than plotted in Fig. ^ thus 
the deviation from this grid's energy from the true E°° 
is of a smaller order of magnitude than the energy de- 
viations found at the grid densities explored in Fig. ^ 
making it a suitable energy to use as E°° . It is clear that 
with a B-spline order of n = 10 the error due to the use of 
the approximate is negligible compared to the error 
due to the finite grid density. Also plotted is the rela- 
tive error in the calculated electron-ion forces F*^~* . The 
method used for calculating the force corresponded to 
the method used to calculate 1/'°"; e.g. the data for the 
forces calculated with the present method and a 6th or- 
der B-spline were done with charge densities relaxed in a 
yzon generated with the present method with a 6th order 
B-spline. Thus errors in the forces that were calculated 
with the present method have some error contribution 
from using the present method for generating ]/*°". The 
relative force error was measured as the fractional root- 
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FIG. 2: The relative difFerence between the total energy (cir- 
cles) and forces (squares) calculated with the present methods 
and with the exact structure factor methods, as a function of 
system size (see Eqs. 13111 and 1321 '). 
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FIG. 3: The computation time required to generate V*°" and 
evaluate the F'^~"s as a function of number of atoms, for the 
traditional and present methods. The numbers on each curve 
represent the exponent of a power law fit. 



mean-square deviation of each force component on each 
atom: 

1/2 

y^A'at \^ (p. _ poo\2 

where the F°° 's are the forces in the hmit of infinite grid 
density, in the same sense as expained above for . In 
this case, already at a B-spline order of n = 8 the forces 
are almost as accurate as those calculated with the exact 
method, and for n = 10, 12 the forces are indistinguish- 
able. 

Second, in order to verify the scaling of these meth- 
ods for different system sizes, four different systems were 
considered. For systems of 32, 64, 96, and 128 Al atoms, 
displaced from crystalline positions as before, 1/*°" was 
generated with the exact 5, and with an approximate 
S generated with 8th-order B-splines. The grid density 
in each case was 237 points/A . After electronic relax- 
ation, forces were calculated using the exact S or with 
the present method, again corresponding to the method 
used to generate The relative error between these 

energies and forces as a function of system size is plotted 
in Fig. 121 The relative errors are measured as before, 
with expressions like Eqs. H31|) and H32|l . but instead 
of comparing to the energy and forces of the infinite 
grid density limit, the energy and forces are compared 
to those calculated using the same grid density and the 
exact structure factor of Eq. ||SJ). The relative error in 
the energy is seen to be constant as a function of system 
size, and the relative error in the forces is actually seen to 
decrease slightly with increasing system size. Thus suffi- 
cient accuracy can be achieved with the present methods 
for calculating V^°^ and the F*^"* using a fixed B-spline 
order and grid density, confirming that these methods 
will scale quasi-linearly with system size. 

Finally, the time required to generate with the ex- 
act and present methods, and to evaluate the electron-ion 
forces with the exact and the present methods is plotted 



in Fig. 131 One unit of time in the figure corresponded 
to 1.38 seconds of execution time on a 450 MHz Pentium 
III processor. The present methods are far superior even 
for modest numbers of atoms. It is also noted that the 
present methods are readily parallelized. 

In conclusion, we present accurate and efficient meth- 
ods for computing the ionic potential and the Born- 
Oppenheimer forces on atoms by utilizing an approxi- 
mate form of the structure factor. It has been demon- 
strated that the errors due to the present methods do 
not increase with increasing system size. As is typically 
the case with approximate numerical methods, there 
is a trade-off between accuracy and efficiency with the 
present methods. Accuracy can be systematically im- 
proved by increasing the grid density or by increasing the 
B-spline order n, both at the expense of more comput- 
ing time. However, it was demonstrated that the errors 
introduced by using the present methods at a B-spline or- 
der of n = 10 are small compared to errors in other parts 
of the electronic structure calculation that arise due to 
the use of a finite grid density. Thus a B-spline order of 
10 is recommended as the optimal compromise for simple 
metals like Al, which was used here as a test case. 



APPENDIX A: THE ELECTRON-ION 
CONTRIBUTION TO THE STRESS TENSOR 

The present methods are readily applied to the compu- 
tation of the electron-ion contribution to the stress ten- 
sor. The stress tensor can be calculated with the methods 
of Nielsen and Martin^^. One constructs an ionic po- 
tential skewed by a strain tensor e, and a density similarly 
skewed and scaled to preserve normalization: 

y;°"(r) = y""((l-He)-ir), (Al) 
p,(r) ^ [det(l + e)]-V((l + e)"'r) 
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The electron-ion contribution to the stress density tensor 
is then given by: 



a/3 







d 



p,(r)F/°"(r)dr(A2) 



Under the strain transformation, the structure factor 
is unchanged, and the Fourier components of the den- 



sity, T[p{li, 12,13)], are simply scaled by [det(l -I- e)]~-^. 
The reciprocal lattice vectors, to first order in e, become 
bi ^ (1 — e)b,;, and hence dhi^/deap = —6a-yhip. Dif- 
ferentiating Eq. H2()|) with respect to €ap, one obtains an 
efficient method for the electron-ion stress density tensor: 



J 



n 



mi,m2,m3 



QqQ/3 TO3)J^[p(Zi, ^2, ^3)]*S("ii, rn3)T[Q{li,l2,h)] (A3) 



r 



where Q = m'jbi -I- m2b2 + m'^h^, and 
P'(mi,TO2,m3) = dVP''P{Q)/d\Q\. This method 
has been subjected to simple tests comparing the 
derivative of the energy with respect to the crystal 
lattice constant to the components of the stress tensor. 



These tests indicate an accuracy similar to the present 
force method, ft must be noted, however, that unlike 
the ionic forces, the cell stress has contributions from all 
terms in the energy, and cr^^* is merely one of them. 
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